
    Vehicle Trajectories from Unlabeled Data through Iterative Plane Registration

    One of the most complex aspects of autonomous driving is understanding the surrounding environment. In particular, the interest lies in detecting which agents populate it and how they are moving. The capacity to predict how these agents may act in the near future would allow an autonomous vehicle to plan its trajectory safely, minimizing the risks for itself and others. In this work we propose an automatic trajectory annotation method exploiting an Iterative Plane Registration algorithm based on homographies and semantic segmentations. The output of our technique is a set of holistic trajectories (past-present-future) paired with a single image context, useful to train a predictive model.

    "Forget" the Forget Gate: Estimating Anomalies in Videos using Self-contained Long Short-Term Memory Networks

    Abnormal event detection is a challenging task that requires effectively handling intricate features of appearance and motion. In this paper, we present an approach for detecting anomalies in videos by training a novel LSTM-based self-contained network on normal dense optical flow. Due to its sigmoid implementation, the forget gate of a standard LSTM is susceptible to overlooking and dismissing relevant content in long-sequence tasks such as abnormality detection. The forget gate reduces the contribution of the previous hidden state to the computation of the cell state, prioritizing the current input. In addition, the hyperbolic tangent activation of standard LSTMs sacrifices performance when a network gets deeper. To tackle these two limitations, we introduce a bi-gated, light LSTM cell by discarding the forget gate and introducing sigmoid activation. Specifically, the LSTM architecture we propose fully sustains content from the previous hidden state, thereby enabling the trained model to be robust and make context-independent decisions during evaluation. Removing the forget gate results in a simplified and undemanding LSTM cell with improved effectiveness and computational efficiency. Empirical evaluations show that the proposed bi-gated LSTM-based network outperforms various LSTM-based models, verifying its effectiveness for abnormality detection and generalization tasks on the CUHK Avenue and UCSD datasets. Comment: 16 pages, 7 figures, Computer Graphics International (CGI) 202
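    The abstract above does not give the cell equations, so the following is only a minimal sketch of what a two-gate LSTM step without a forget gate might look like. The function names, the weight layout, the additive cell update and the use of sigmoid for both the candidate and the output squashing are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_size, hidden_size, seed=0):
    """Small random weights for the input gate (i), output gate (o) and candidate (g)."""
    rng = np.random.default_rng(seed)
    shape = (hidden_size, input_size + hidden_size)
    params = {}
    for gate in ("i", "o", "g"):
        params["W_" + gate] = rng.normal(scale=0.1, size=shape)
        params["b_" + gate] = np.zeros(hidden_size)
    return params

def bigated_lstm_step(x, h_prev, c_prev, params):
    """One step of a hypothetical LSTM cell with input and output gates only."""
    z = np.concatenate([x, h_prev])
    i = sigmoid(params["W_i"] @ z + params["b_i"])  # input gate
    o = sigmoid(params["W_o"] @ z + params["b_o"])  # output gate
    # Assumption: the candidate uses sigmoid where a standard LSTM uses tanh.
    g = sigmoid(params["W_g"] @ z + params["b_g"])
    # No forget gate: the previous cell state is carried over unscaled.
    c = c_prev + i * g
    # Assumption: the cell state is squashed with sigmoid before the output gate.
    h = o * sigmoid(c)
    return h, c

# Run the cell over a short random sequence of 3-dimensional inputs.
params = init_params(input_size=3, hidden_size=4)
h = np.zeros(4)
c = np.zeros(4)
for x in np.random.default_rng(1).normal(size=(5, 3)):
    h, c = bigated_lstm_step(x, h, c, params)
```

    Under these assumptions the cell has three weight matrices where a standard LSTM has four, which is one way to read the "light" and "undemanding" description in the abstract.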

    Real-time Embedded Person Detection and Tracking for Shopping Behaviour Analysis

    Shopping behaviour analysis through counting and tracking of people in shop-like environments offers valuable information for store operators and provides key insights into the store's layout (e.g. frequently visited spots). Instead of using extra staff for this, automated on-premise solutions are preferred. These automated systems should be cost-effective, preferably run on lightweight embedded hardware, work in very challenging situations (e.g. handling occlusions) and preferably work in real time. We solve this challenge by implementing a real-time TensorRT-optimized YOLOv3-based pedestrian detector on a Jetson TX2 hardware platform. By combining the detector with a sparse optical flow tracker we assign a unique ID to each customer and tackle the problem of losing partially occluded customers. Our detector-tracker based solution achieves an average precision of 81.59% at a processing speed of 10 FPS. Besides valuable statistics, heat maps of frequently visited spots are extracted and used as an overlay on the video stream.

    Distinguishing Posed and Spontaneous Smiles by Facial Dynamics

    A smile is one of the key elements in identifying the emotions and present state of mind of an individual. In this work, we propose a cluster of approaches to classify posed and spontaneous smiles using deep convolutional neural network (CNN) face features, local phase quantization (LPQ), dense optical flow and histogram of oriented gradients (HOG). Eulerian Video Magnification (EVM) is used for micro-expression smile amplification along with three normalization procedures for distinguishing posed and spontaneous smiles. Although the deep CNN face model is trained with a large number of face images, HOG features outperform this model on the overall face smile classification task. Using EVM to amplify micro-expressions did not have a significant impact on classification accuracy, while normalizing facial features improved classification accuracy. Unlike many manual or semi-automatic methodologies, our approach aims to automatically classify all smiles into either 'spontaneous' or 'posed' categories by using support vector machines (SVM). Experimental results on the large UvA-NEMO smile database are promising compared to other relevant methods. Comment: 16 pages, 8 figures, ACCV 2016, Second Workshop on Spontaneous Facial Behavior Analysis

    4D Match Trees for Non-rigid Surface Alignment

    This paper presents a method for dense 4D temporal alignment of partial reconstructions of non-rigid surfaces observed from single or multiple moving cameras of complex scenes. 4D Match Trees are introduced for robust global alignment of non-rigid shape based on the similarity between images across sequences and views. Wide-timeframe sparse correspondence between arbitrary pairs of images is established using a segmentation-based feature detector (SFD), which is demonstrated to give improved matching of non-rigid shape. Sparse SFD correspondence allows the similarity between any pair of image frames to be estimated for moving cameras and multiple views. This enables construction of the 4D Match Tree, which minimises the observed change in non-rigid shape for global alignment across all images. Dense 4D temporal correspondence across all frames is then estimated by traversing the 4D Match Tree using optical flow initialised from the sparse feature matches. The approach is evaluated on single and multiple view image sequences for alignment of partial surface reconstructions of dynamic objects in complex indoor and outdoor scenes to obtain a temporally consistent 4D representation. Comparison to previous 2D and 3D scene flow demonstrates that 4D Match Trees achieve reduced errors due to drift and improved robustness to large non-rigid deformations.
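    One way to read "a tree that minimises the observed change in shape across all images" is as a minimum spanning tree over a pairwise frame dissimilarity matrix. The sketch below takes that reading as an assumption; the abstract does not state the construction. It builds the tree with Prim's algorithm, and the function name `match_tree` and the example dissimilarity values are hypothetical.

```python
import numpy as np

def match_tree(dissimilarity, root=0):
    """Prim's algorithm: return (parent, child) edges of a minimum spanning tree."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[root] = True
    best_cost = d[root].copy()       # cheapest known link from the tree to each frame
    best_parent = np.full(n, root)   # tree frame that provides that link
    edges = []
    for _ in range(n - 1):
        # Pick the frame outside the tree that is cheapest to attach.
        cost = np.where(in_tree, np.inf, best_cost)
        child = int(np.argmin(cost))
        edges.append((int(best_parent[child]), child))
        in_tree[child] = True
        # The new frame may offer cheaper links to the remaining frames.
        closer = d[child] < best_cost
        best_cost = np.where(closer, d[child], best_cost)
        best_parent = np.where(closer, child, best_parent)
    return edges

# Hypothetical dissimilarities between four frames: frame 3 resembles frame 0
# more than its temporal neighbour, frame 2.
D = np.array([
    [0.0, 1.0, 4.0, 1.5],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 3.0],
    [1.5, 5.0, 3.0, 0.0],
])
edges = match_tree(D)
```

    In this example frame 3 attaches directly to frame 0 instead of following temporal order, which illustrates how a non-sequential traversal could shorten alignment paths.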

    Combinatorics of Go


    Two-Frame Motion Estimation Based on Polynomial Expansion

    This paper presents a novel two-frame motion estimation algorithm. The first step is to approximate each neighborhood of both frames by quadratic polynomials, which can be done efficiently using the polynomial expansion transform. By observing how an exact polynomial transforms under translation, a method to estimate displacement fields from the polynomial expansion coefficients is derived, which after a series of refinements leads to a robust algorithm. Evaluation on the Yosemite sequence shows good results.
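    The translation argument can be illustrated for an exact quadratic. If f1(x) = x^T A x + b1^T x + c1 with symmetric A, then f2(x) = f1(x - d) has the same A and linear coefficient b2 = b1 - 2 A d, so d = -1/2 A^{-1} (b2 - b1). The sketch below checks this identity numerically for one neighborhood. The function names and coefficient values are made up for illustration, and the refinements mentioned in the abstract are not reproduced.

```python
import numpy as np

def shift_polynomial(A, b, c, d):
    """Coefficients of f(x - d) for f(x) = x^T A x + b^T x + c, with A symmetric."""
    b_shift = b - 2.0 * A @ d
    c_shift = d @ A @ d - b @ d + c
    return A, b_shift, c_shift

def estimate_displacement(A, b1, b2):
    """Solve b2 = b1 - 2 A d for the displacement d."""
    return -0.5 * np.linalg.solve(A, b2 - b1)

def evaluate(A, b, c, x):
    return x @ A @ x + b @ x + c

# Hypothetical expansion coefficients for one neighborhood of the first frame.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
b1 = np.array([1.0, -3.0])
c1 = 0.7
d_true = np.array([0.8, -1.2])

# Second frame: the same signal translated by d_true.
_, b2, c2 = shift_polynomial(A, b1, c1, d_true)
d_est = estimate_displacement(A, b1, b2)
```

    For this exact polynomial `d_est` matches `d_true` up to floating-point error. On real images the two frames yield different, noisy coefficients, which is presumably where the refinements come in.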

    A Theoretical Comparison of Different Orientation Tensors

    Orientation tensors are a powerful representation of local orientation. Over the years, several different approaches to estimating the tensors have appeared. The derivations of the different tensors vary to a great extent. This partly obstructs a theoretical comparison between them, which would otherwise be useful when one wants to choose the best tensor for a particular application. This paper shows that all the existing tensors can be derived using a common framework. The derivation is based on signal models and the concept of orientation functionals. The idea is to estimate a signal model and compute a suitable orientation functional in terms of the model parameters. The models used in this paper are polynomial models and quadrature models. This framework may also aid in the design of orientation tensors based on other signal models.

    PatchTable
